Abstract: We provide the M-theory for the standard multivariate linear model $Y = XB + E$, where $Y$ is an $n \times p$
matrix of observations, $X$ is an $n \times m$ design matrix, $B$ is an $m \times p$ matrix of unknown parameters and $E$ is an $n \times p$
matrix of errors with the row vectors independently distributed. Two test criteria based on the roots of
determinantal equations are proposed for testing linear hypotheses of the form $P'B = C_0$, where $P$ is a matrix
of rank $q$. The tests are similar to those considered in MANOVA using least squares techniques. One of them
is the Wald type statistic and the other is Rao's score type statistic. The asymptotic distributions of these test
statistics are derived. Consistent estimates of nuisance parameters are obtained for use in computing the test
statistics.

The M-method of estimation considered is the minimization of $\sum_{i=1}^{n} \rho(\varepsilon_i)$, where $\rho$ is a convex function and
$\varepsilon_i$ is the $i$-th row vector in $(Y - XB)$. All results are derived under a minimal set of conditions.

AMS Subject Classification: 62H15, 62H10.

Key words and phrases: MANOVA; M-estimation; Rao's score test; roots of determinantal equation; Wald test.


1.  Introduction 

In  a  recent  paper  Bai,  Rao  and  Wu  (1992)  considered  the  problem  of  estimation 
and  testing  under  the  M-theory  for  the  model 

$$Y_i = X_i'\beta + \varepsilon_i, \tag{1.1}$$

where $Y_i$ is a $p$-vector of observations, $\varepsilon_i$ is a $p$-vector of errors, $\{X_i\}$ is a design
sequence of $m \times p$ matrices and $\beta$ is an $m$-vector of unknown parameters. The discussion
was confined to estimation of $\beta$ by minimizing

$$\sum_{i=1}^{n} \rho(Y_i - X_i'\beta) \tag{1.2}$$

choosing any convex function $\rho$. The asymptotic distribution of $\hat\beta$, the estimate so
obtained, was derived. For testing the hypothesis $P'\beta = C_0$, the test criterion proposed
was the likelihood ratio type

$$\min_{P'\beta = C_0} \sum_{i=1}^{n} \rho(Y_i - X_i'\beta) - \min_{\beta} \sum_{i=1}^{n} \rho(Y_i - X_i'\beta), \tag{1.3}$$

which, under suitable normalization, has an asymptotic distribution which is a mixture
of chi-squares.

Correspondence to: Prof. C.R. Rao, Dept. of Statistics, Centre for Multivariate Analysis, The Pennsylvania
State University, 417 Classroom Building, University Park, PA 16802, USA.

Research sponsored by the Air Force Office of Scientific Research under Grant AFOSR-89-0279 and the
U.S. Army Research Office under Grant DAAL03-89-K-0139.

We now consider a special case of (1.1), the standard multivariate linear model

$$Y_i = B'X_i + \varepsilon_i, \tag{1.4}$$

where $Y_i$ and $\varepsilon_i$ are as in the model (1.1), $B$ is an $m \times p$ matrix of regression
coefficients and $\{X_i\}$ is a design sequence of $m$-vectors. As in (1.2), we estimate $B$ by
minimizing

$$\sum_{i=1}^{n} \rho(Y_i - B'X_i), \tag{1.5}$$

where $\rho$ is a convex function, and develop MANOVA type analysis leading to test
criteria based on the roots of a determinantal equation for testing hypotheses of the
type $P'B = C_0$, where $P$ is an $m \times q$ matrix of rank $q$.
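As an illustration of the estimation step (1.5), the following is a minimal sketch that minimizes the sum of a convex loss of the residual vectors by iteratively reweighted least squares. The Huber-type loss of the residual norm, the tuning constant `c`, the simulated data and all function names are assumptions made for this example; the paper itself only requires the loss to be convex and does not prescribe an algorithm.

```python
import numpy as np

def huber_rho(r, c=1.345):
    """Convex loss applied to the residual norm r = ||u|| (an illustrative choice)."""
    return np.where(r <= c, 0.5 * r**2, c * r - 0.5 * c**2)

def objective(B, X, Y, c=1.345):
    """Criterion (1.5): sum over i of rho(Y_i - B'X_i); rows of X are X_i'."""
    r = np.linalg.norm(Y - X @ B, axis=1)
    return huber_rho(r, c).sum()

def m_estimate(X, Y, c=1.345, n_iter=200, tol=1e-10):
    """Iteratively reweighted least squares, started from the least squares fit."""
    B = np.linalg.lstsq(X, Y, rcond=None)[0]
    for _ in range(n_iter):
        r = np.linalg.norm(Y - X @ B, axis=1)
        w = np.minimum(1.0, c / np.maximum(r, 1e-12))   # weight rho'(r) / r
        Xw = X * w[:, None]
        B_new = np.linalg.solve(X.T @ Xw, Xw.T @ Y)      # weighted normal equations
        if np.linalg.norm(B_new - B) < tol:
            return B_new
        B = B_new
    return B

# Simulated data (hypothetical): n = 500 rows, m = 3 regressors, p = 2 responses.
rng = np.random.default_rng(0)
n, m, p = 500, 3, 2
X = rng.normal(size=(n, m))
B_true = np.array([[1.0, -1.0], [0.5, 2.0], [0.0, 1.0]])
Y = X @ B_true + rng.standard_t(df=3, size=(n, p))      # heavy-tailed errors

B_ols = np.linalg.lstsq(X, Y, rcond=None)[0]
B_hat = m_estimate(X, Y)
```

Each reweighting step cannot increase the Huber-type criterion, so `objective(B_hat, X, Y)` is no larger than the value at the least squares start.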


2.  Notations  and  assumptions 


Let $\psi(u)$ be a choice of a subgradient of $\rho$ at $u = (u_1, \ldots, u_p)'$. [A $p$-vector $\psi(u)$
is said to be a subgradient of $\rho$ at $u$ if $\rho(z) \ge \rho(u) + (z - u)'\psi(u)$ for all $z \in R^p$.] Note
that if $\rho$ is differentiable at $u$ according to the usual definition, $\rho$ has a unique
subgradient at $u$ and vice versa. In this case

$$\psi(u) = \nabla\rho(u) = \left( \frac{\partial\rho}{\partial u_1}, \ldots, \frac{\partial\rho}{\partial u_p} \right)'.$$

Denote by $\mathcal{D}$ the set of points where $\rho$ is not differentiable. This is, in fact, the set
of points where $\psi$ is discontinuous, which is the same for all choices of $\psi$. It is well
known that $\mathcal{D}$ is topologically an $F_\sigma$ set of Lebesgue measure zero (ref. Rockafellar
(1970), p. 218 and Section 25).
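The subgradient inequality can be checked numerically for a loss that is convex but not differentiable everywhere. The sketch below uses the sum of absolute values as the loss and the coordinatewise sign as one subgradient choice; both are assumptions picked for the example, not choices made in the paper.

```python
import numpy as np

def rho(u):
    """Convex loss that is not differentiable where some coordinate is zero."""
    return np.abs(u).sum()

def psi(u):
    """One valid subgradient choice: the sign vector (0 at a kink)."""
    return np.sign(u)

rng = np.random.default_rng(1)
p = 3
u = np.array([0.0, -1.5, 2.0])          # the first coordinate sits on a kink
Z = rng.normal(size=(1000, p))          # random comparison points z

# Subgradient inequality: rho(z) - rho(u) - (z - u)' psi(u) >= 0 for every z.
gaps = np.array([rho(z) - rho(u) - (z - u) @ psi(u) for z in Z])
subgradient_ok = bool(np.all(gaps >= -1e-12))
```

At the kink any value in $[-1, 1]$ would serve as the first coordinate of the subgradient, which is why the choice of $\psi$ is not unique on the non-differentiability set.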

We assume that $\psi(u)$ is measurable and make the following assumptions as in
Bai, Rao and Wu (1992):

$(A_1)$ The common distribution function $F$ of $\varepsilon_i$ satisfies $F(\mathcal{D}) = 0$. (This ensures that
certain functionals of $\psi$ which appear in our discussion have unique values.)

$(A_2)$ $E\psi(\varepsilon_1 + u) = Au + o(\|u\|)$ as $\|u\| \to 0$, where $A > 0$ is a $p \times p$ constant matrix.

$(A_3)$ $E\|\psi(\varepsilon_1 + u)\|^2$ is finite for small $\|u\|$ and is continuous at $u = 0$ as a function
of $u$.

$(A_4)$ $E[\psi(\varepsilon_1)][\psi(\varepsilon_1)]' = \Gamma > 0$.

$(A_5)$ $S_n = \sum_{i=1}^{n} X_i X_i' > 0$ and

$$d_n^2 = \max_{1 \le i \le n} X_i' S_n^{-1} X_i \to 0 \quad \text{as } n \to \infty.$$

We denote by $\hat B$ and $\tilde B$ any values of $B$ which minimize

$$\sum_{i=1}^{n} \rho(Y_i - B'X_i)$$

respectively without any restriction and subject to the restriction

$$P'B = C_0 \tag{2.1}$$

specified as a hypothesis, where $P$ is an $m \times q$ matrix of rank $q$. Further let

$$Z(B) = \sum_{i=1}^{n} X_i \psi'(Y_i - B'X_i), \tag{2.2}$$

which is an $m \times p$ matrix.

For testing the hypothesis $P'B = C_0$, we propose two alternative test criteria. One
is based on the roots of the determinantal equation

$$|W_n - \theta \hat A^{-1} \hat\Gamma \hat A^{-1}| = 0, \tag{2.3}$$

where

$$W_n = (P'\hat B - C_0)'(P'S_n^{-1}P)^{-1}(P'\hat B - C_0) \tag{2.4}$$

is the Wald type statistic, and $(\hat A, \hat\Gamma)$ is a consistent estimate of $(A, \Gamma)$, the matrix
parameters defined in assumptions $(A_2)$ and $(A_4)$ respectively. In Section 5 of this
paper, we discuss the estimation of $(A, \Gamma)$. Another test is based on the roots of the
determinantal equation

$$|R_n - \theta \hat\Gamma| = 0, \tag{2.5}$$

where

$$R_n = [Z(\tilde B)]' S_n^{-1} Z(\tilde B) \tag{2.6}$$

is Rao's score type statistic (see Rao (1948)), and $\hat\Gamma$ is a consistent estimate of
$\Gamma$. The asymptotic distribution of the roots of (2.3) or (2.5) is the same as that in
the normal theory, and hence the tests proposed by Fisher and Hsu (see for instance
Rao (1973, pp. 556-560)) can be used.
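To make the two criteria concrete, here is a minimal sketch of how the Wald type matrix (2.4), the score type matrix (2.6) and the roots of a determinantal equation of the form (2.3) or (2.5) could be computed once the estimates are available. The function names, the row-wise calling convention for `psi` and the toy inputs are assumptions for illustration only.

```python
import numpy as np

def wald_statistic(B_hat, P, C0, S_n):
    """W_n = (P'B - C0)' (P' S_n^{-1} P)^{-1} (P'B - C0), a p x p matrix."""
    D = P.T @ B_hat - C0                                   # q x p
    return D.T @ np.linalg.solve(P.T @ np.linalg.solve(S_n, P), D)

def score_statistic(B_tilde, X, Y, psi):
    """R_n = Z' S_n^{-1} Z with Z = sum_i X_i psi'(Y_i - B'X_i), rows of X being X_i'."""
    Zmat = X.T @ psi(Y - X @ B_tilde)                      # m x p; psi acts row-wise
    return Zmat.T @ np.linalg.solve(X.T @ X, Zmat)

def determinantal_roots(W, M):
    """Roots theta of |W - theta M| = 0 (M positive definite), largest first."""
    return np.sort(np.linalg.eigvals(np.linalg.solve(M, W)).real)[::-1]

# Toy check: with A = Gamma = I the roots are the eigenvalues of W_n itself.
P = np.array([[1.0], [0.0]])
S_n = 4.0 * np.eye(2)
B_hat = np.array([[1.0, 2.0], [3.0, 4.0]])
W_n = wald_statistic(B_hat, P, np.zeros((1, 2)), S_n)
roots = determinantal_roots(W_n, np.eye(2))
```

For the Wald criterion one would pass `M = inv(A_hat) @ Gamma_hat @ inv(A_hat)`, and for the score criterion `M = Gamma_hat`.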

It may be noted that tests of the above type were suggested by Sen (1982) and
Singer and Sen (1985) in the multivariate situation under methods of M-estimation
and assumptions different from ours, and by Schrader and Hettmansperger (1980)
in the univariate case. Some papers of related interest are by Inagaki (1973), Heiler
and Willers (1988) and Jurečková (1983). It may be seen that our conditions are
somewhat simpler in view of the convexity of the loss function.

In  Section  3,  we  state  the  main  theorems  and  in  Section  4,  we  provide  proofs 
under  what  we  believe  to  be  a  minimal  set  of  conditions.  A  new  feature  of  the  paper 
is the discussion on consistent estimation of the nuisance parameters $A$ and $\Gamma$
without making any further assumptions on $\psi$.

The  results  of  the  paper  could  be  extended  to  other  methods  of  M-estimation  such 
as  those  with  scale  invariance  or  those  based  on  estimating  equations  only.  But  they 
seem  to  need  heavy  assumptions  for  a  rigorous  treatment.  It  would  also  be  of  some 
interest  to  consider  rates  of  convergence  and  related  problems.  We  hope  to  consider 
such  problems  in  future  research. 


3.  The  main  theorems 

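For convenience, we write

$$X_{in} = S_n^{-1/2} X_i, \qquad P_n' = (P'S_n^{-1}P)^{-1/2} P' S_n^{-1/2}, \tag{3.1}$$

so that

$$\sum_{i=1}^{n} X_{in} X_{in}' = I_m, \qquad P_n' P_n = I_q, \tag{3.2}$$

$$U_n' = A^{-1} \sum_{i=1}^{n} \psi(\varepsilon_i) X_{in}' P_n = (u_{1n}, \ldots, u_{qn}), \tag{3.3}$$

$$V_n' = \sum_{i=1}^{n} \psi(\varepsilon_i) X_{in}' P_n = (v_{1n}, \ldots, v_{qn}). \tag{3.4}$$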
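The normalizations in (3.1) and the identities in (3.2) are easy to verify numerically. The sketch below does so for a randomly generated design; the dimensions, the random inputs and the helper `inv_sqrt` are assumptions made for the illustration.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(S)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

rng = np.random.default_rng(2)
n, m, q = 50, 4, 2
X = rng.normal(size=(n, m))              # rows are X_i'
P = rng.normal(size=(m, q))              # full column rank with probability one

S_n = X.T @ X
S_inv_half = inv_sqrt(S_n)
X_n = X @ S_inv_half                     # rows are X_in' = (S_n^{-1/2} X_i)'
P_n = S_inv_half @ P @ inv_sqrt(P.T @ np.linalg.solve(S_n, P))

sum_outer = X_n.T @ X_n                  # sum_i X_in X_in'
d_n2 = (X_n ** 2).sum(axis=1).max()      # d_n^2 = max_i X_i' S_n^{-1} X_i
```

Here `sum_outer` equals the identity of order `m` and `P_n.T @ P_n` the identity of order `q`, up to rounding; `d_n2` is the largest leverage of the design, the quantity required to vanish in assumption (A5).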

We also consider a sequence of alternatives to the null hypothesis $P'B = C_0$,

$$H_n: \; P'(B - B_0) = P'\Delta_n, \tag{3.5}$$

where $B_0$ and $\Delta_n$ are known $m \times p$ matrices such that

$$P'B_0 = C_0 \quad \text{and} \quad \|S_n^{1/2}\Delta_n\| = O(1), \tag{3.6}$$

and denote

$$\Theta_n = P_n' S_n^{1/2} \Delta_n = (P'S_n^{-1}P)^{-1/2} P'\Delta_n. \tag{3.7}$$

It is easily seen that $u_{1n}, \ldots, u_{qn}$ are asymptotically independent with the common
limiting distribution $N_p(0, A^{-1}\Gamma A^{-1})$, so that the limiting distribution of $U_n'U_n$ is
central Wishart on $q$ degrees of freedom, $W_p(q, A^{-1}\Gamma A^{-1})$. Similarly $v_{1n}, \ldots, v_{qn}$ are
asymptotically independent with the common limiting distribution $N_p(0, \Gamma)$, so that
the limiting distribution of $V_n'V_n$ is central Wishart on $q$ degrees of freedom,
$W_p(q, \Gamma)$.

We have the following theorems concerning the asymptotic distributions of $W_n$
and $R_n$ under the null hypothesis and also under the sequence of alternative
hypotheses (3.5).

Theorem 3.1. Assume that under the model (1.4), the assumptions $(A_1)$-$(A_5)$ and
condition (3.6) on the sequence of alternative hypotheses hold. Then

$$W_n = (U_n + \Theta_n)'(U_n + \Theta_n) + o_p(1) \quad \text{as } n \to \infty. \tag{3.8}$$

Especially, if the null hypothesis $H_0$ holds or $\|S_n^{1/2}\Delta_n\| \to 0$ as $n \to \infty$, then the
asymptotic distribution of $W_n$ is the central Wishart, $W_p(q, A^{-1}\Gamma A^{-1})$. If the local
alternatives $\Theta_n = (P'S_n^{-1}P)^{-1/2}P'\Delta_n$ have a limit $\Theta \ne 0$ as $n \to \infty$, then the asymptotic
distribution of $W_n$ is the noncentral Wishart, $W_p(q, A^{-1}\Gamma A^{-1}, \Theta'\Theta)$. [See Rao
(1973, p. 534) for the definition of the noncentral Wishart distribution.]

Theorem 3.2. Suppose that under the model (1.4), the assumptions $(A_1)$-$(A_5)$ are
satisfied, and condition (3.6) holds. Then

$$R_n = (V_n + \Theta_n A)'(V_n + \Theta_n A) + o_p(1) \quad \text{as } n \to \infty. \tag{3.9}$$

Especially, if $H_0$ holds or $\|S_n^{1/2}\Delta_n\| \to 0$ as $n \to \infty$, the asymptotic distribution of $R_n$
is the central Wishart, $W_p(q, \Gamma)$. If the local alternatives $\Theta_n$ have the limit $\Theta \ne 0$,
then the asymptotic distribution of $R_n$ is the noncentral Wishart, $W_p(q, \Gamma, A\Theta'\Theta A)$.

Note: The test based on $W_n$ involves two nuisance matrix parameters $\Gamma$ and $A$,
both of which need to be estimated for computing the test criteria. On the other
hand, the test based on $R_n$ involves only the nuisance matrix parameter $\Gamma$.
Further, the local power for the sequence of alternatives
considered depends on the magnitude of the roots of the equation

$$|\Theta'\Theta - \lambda A^{-1}\Gamma A^{-1}| = 0 \tag{3.10}$$

for the test based on $W_n$ and on the roots of the equation

$$|A\Theta'\Theta A - \lambda\Gamma| = 0 \tag{3.11}$$

for the test based on $R_n$. Since the roots of (3.10) and (3.11) are the same, the two
alternative tests are equally efficient asymptotically. In such a case, the test based
on $R_n$ may be preferred to that based on $W_n$, as only $\Gamma$ has to be estimated in the
former case and both $\Gamma$ and $A$ have to be estimated in the latter case. This statement
is true only for large samples. The relative merits of these two tests remain to be
investigated in small samples.
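The claim that (3.10) and (3.11) have the same roots follows by multiplying the matrix inside the determinant in (3.10) on both sides by $A$. A quick numerical check is sketched below; the randomly generated positive definite matrices and the local-alternative matrix are assumptions made up for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
p, q = 3, 2

def random_spd(k):
    """Random symmetric positive definite k x k matrix (illustrative input)."""
    G = rng.normal(size=(k, k))
    return G @ G.T + k * np.eye(k)

A = random_spd(p)
Gamma = random_spd(p)
Theta = rng.normal(size=(q, p))
TT = Theta.T @ Theta                                  # Theta' Theta, rank q

A_inv = np.linalg.inv(A)
M_wald = A_inv @ Gamma @ A_inv                        # A^{-1} Gamma A^{-1}

# Roots of |Theta'Theta - lambda A^{-1} Gamma A^{-1}| = 0   (3.10)
roots_wald = np.sort(np.linalg.eigvals(np.linalg.solve(M_wald, TT)).real)
# Roots of |A Theta'Theta A - lambda Gamma| = 0             (3.11)
roots_score = np.sort(np.linalg.eigvals(np.linalg.solve(Gamma, A @ TT @ A)).real)
```

Since $\Theta'\Theta$ has rank at most $q$, only $q$ of the $p$ roots can be nonzero in either equation.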


4. Proof of the main theorems

In the following, for a set $A$, $I(A)$ denotes its indicator function. We write

$$B^* = \mathrm{vec}\, B = \mathrm{vec}(\beta_1 : \cdots : \beta_p) = (\beta_1', \ldots, \beta_p')', \tag{4.1}$$

and the same notation applies to other matrices.

To prove the theorems stated in Section 3, we need some lemmas. Without loss
of generality, we assume that $C_0 = 0$ in (2.1), i.e., $B_0 = 0$ in (3.5). There exists an
$m \times (m-q)$ matrix $K$ of rank $m-q$ such that

$$P'K = 0. \tag{4.2}$$

Without loss of generality we can assume that $K'\Delta_n = 0$. The hypotheses $H_0$ and $H_n$
can be written as

$H_0$: $B = KM_0$ for some $(m-q) \times p$ matrix $M_0$,

$H_n$: $B = KM_0 + \Delta_n$.

Define

$$P_n' = (P'S_n^{-1}P)^{-1/2} P' S_n^{-1/2}, \qquad K_n = S_n^{1/2} K (K'S_nK)^{-1/2}. \tag{4.3}$$

Then

$$K_nK_n' + P_nP_n' = I_m, \qquad P_n'K_n = 0. \tag{4.4}$$
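The orthogonal decomposition behind (4.3) and (4.4) can be checked numerically: with $K$ spanning the null space of $P'$, the normalized matrices $K_n$ and $P_n$ have orthonormal columns that are mutually orthogonal, so their projectors add up to the identity. The sketch below is an illustration with random inputs; obtaining $K$ from a singular value decomposition and the helper `sym_power` are assumptions of the example.

```python
import numpy as np

def sym_power(S, a):
    """S**a for a symmetric positive definite matrix, via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(S)
    return (vecs * vals ** a) @ vecs.T

rng = np.random.default_rng(5)
n, m, q = 40, 5, 2
X = rng.normal(size=(n, m))
P = rng.normal(size=(m, q))
S_n = X.T @ X

# K: m x (m - q) with P'K = 0, taken from the left singular vectors of P.
U, _, _ = np.linalg.svd(P, full_matrices=True)
K = U[:, q:]

P_n = sym_power(S_n, -0.5) @ P @ sym_power(P.T @ np.linalg.inv(S_n) @ P, -0.5)
K_n = sym_power(S_n, 0.5) @ K @ sym_power(K.T @ S_n @ K, -0.5)

projector_sum = K_n @ K_n.T + P_n @ P_n.T      # should equal I_m
cross = P_n.T @ K_n                            # should equal 0
```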

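If $H_n$ holds, the model (1.4) can be rewritten as

$$Y_i = B_n'X_{in} + \varepsilon_i, \tag{4.5}$$

where $X_{in} = S_n^{-1/2}X_i$, as defined in (3.1),

$$B_n = K_nM_n + P_n\Theta_n, \tag{4.6}$$

$$M_n = (K'S_nK)^{1/2}M_0 + K_n'S_n^{1/2}\Delta_n, \tag{4.7}$$

and $\Theta_n$ is defined by (3.7).

Put $M_{0n} = (K'S_nK)^{1/2}M_0$. The model (1.4) under $H_0$ has the form

$$Y_i = (K_nM_{0n})'X_{in} + \varepsilon_i. \tag{4.8}$$

Denote by $\tilde M_n$ the M-estimate of $M_{0n}$, i.e., $\tilde M_n$ is such that

$$\sum_{i=1}^{n} \rho(Y_i - \tilde M_n'K_n'X_{in}) = \min_{M:\,(m-q) \times p} \sum_{i=1}^{n} \rho(Y_i - M'K_n'X_{in}). \tag{4.9}$$

Note that the restricted M-estimate of $B_n$ is $\tilde B_n = S_n^{1/2}\tilde B = K_n\tilde M_n$.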

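Lemma 4.1. Suppose that $(A_1)$-$(A_5)$, (3.5) and (3.6) are satisfied. For any constant
$c > 0$ we have

$$\sup_{\|G - M_n\| \le c} \Bigl\| \sum_{i=1}^{n} \{\psi(Y_i - G'K_n'X_{in}) - \psi(Y_i - B_n'X_{in})\} \otimes X_{in} + (A \otimes K_n)(G^* - M_n^*) - (A \otimes P_n)\Theta_n^* \Bigr\| \to 0 \quad \text{in pr.}, \tag{4.10}$$

and

$$\sup_{\|G - M_n\| \le c} \Bigl| \sum_{i=1}^{n} \{\rho(Y_i - G'K_n'X_{in}) - \rho(Y_i - B_n'X_{in})\} + f_n(\Theta_n^*) + g_n(G^* - M_n^*) \Bigr| \to 0 \quad \text{in pr.}, \tag{4.11}$$

where

$$f_n(\Theta_n^*) = -\Theta_n^{*\prime} \sum_{i=1}^{n} \psi(\varepsilon_i) \otimes (P_n'X_{in}) - \tfrac{1}{2}\Theta_n^{*\prime}(A \otimes I_q)\Theta_n^*, \tag{4.12}$$

$$g_n(G^* - M_n^*) = (G^* - M_n^*)' \sum_{i=1}^{n} \psi(\varepsilon_i) \otimes (K_n'X_{in}) - \tfrac{1}{2}(G^* - M_n^*)'(A \otimes I_{m-q})(G^* - M_n^*). \tag{4.13}$$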
Note that (4.10) and (4.11) can be simply rewritten as

$$\sup_{\|B\| \le c} \Bigl\| \sum_{i=1}^{n} \{\psi(\varepsilon_i - B'X_{in}) - \psi(\varepsilon_i)\} \otimes X_{in} + (A \otimes I_m)B^* \Bigr\| \to 0 \quad \text{in pr.} \tag{4.10$'$}$$

and

$$\sup_{\|B\| \le c} \Bigl| \sum_{i=1}^{n} \{\rho(\varepsilon_i - B'X_{in}) - \rho(\varepsilon_i) + B^{*\prime}(\psi(\varepsilon_i) \otimes X_{in})\} - \tfrac{1}{2}B^{*\prime}(A \otimes I_m)B^* \Bigr| \to 0 \quad \text{in pr.} \tag{4.11$'$}$$

The proofs of (4.11)' and (4.10)' are similar to those of Theorems 2.1 and 2.3 in
Bai, Rao and Wu (1992). Note that when we use Theorem 25.7 in Rockafellar
(1970), we could remove the differentiability condition on $\{f_i\}$, regard $\nabla f_i(x)$ as a
subgradient of $f_i$ at $x$, and only keep the differentiability condition on the limit
function $f$.


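(4.15)

Proof. Write

$$\bar M_n = M_n + \sum_{i=1}^{n} K_n'X_{in}\psi'(\varepsilon_i)A^{-1}$$

or

$$\bar M_n^* = M_n^* + \sum_{i=1}^{n} (A^{-1}\psi(\varepsilon_i)) \otimes (K_n'X_{in}).$$

Since $\bar M_n^* - M_n^*$ has an asymptotic normal distribution, we have

$$\|\bar M_n - M_n\| = O_p(1). \tag{4.16}$$

By (4.11), it follows that for any $c > 0$ and $\delta > 0$,

$$\sup_{\|G - \bar M_n\| = \delta} I(\|\bar M_n - M_n\| \le c) \Bigl| \sum_{i=1}^{n} \{\rho(Y_i - G'K_n'X_{in}) - \rho(Y_i - \bar M_n'K_n'X_{in})\} - \tfrac{1}{2}(G^* - \bar M_n^*)'(A \otimes I_{m-q})(G^* - \bar M_n^*) \Bigr| \to 0 \quad \text{in pr.}, \tag{4.17}$$

and that for $n$ large, the event $(\|\bar M_n - M_n\| \le c)$ implies that

$$\inf_{\|G - \bar M_n\| = \delta} \sum_{i=1}^{n} \rho(Y_i - G'K_n'X_{in}) \ge \sum_{i=1}^{n} \rho(Y_i - \bar M_n'K_n'X_{in}) + \lambda, \tag{4.18}$$

for some $\lambda > 0$. By (4.16), (4.18) and the convexity of $\rho$, we get

$$P(\|\tilde M_n - \bar M_n\| \ge \delta) \to 0 \quad \text{as } n \to \infty, \tag{4.19}$$

and (4.14) follows.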

Proof of Theorem 3.1. Without loss of generality we assume that $C_0 = 0$. Under
$H_n$ we have from (4.6)

$$B_n = K_nM_n + P_n\Theta_n$$

(refer to (4.6), (4.7) and (3.7)). By (4.15) we have

$$\hat B_n = B_n + \sum_{i=1}^{n} X_{in}\psi'(\varepsilon_i)A^{-1} + o_p(1), \tag{4.20}$$

where $\hat B_n = S_n^{1/2}\hat B$. By (2.4) and (3.1),

$$W_n = \hat B'P(P'S_n^{-1}P)^{-1}P'\hat B = \hat B_n'P_nP_n'\hat B_n. \tag{4.21}$$

By (4.6), (4.20) and (4.4) we get

$$P_n'\hat B_n = U_n + \Theta_n + o_p(1), \tag{4.22}$$

and the theorem follows from (4.21) and (4.22).
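The step from (4.20) to (4.22) can be spelled out as follows. This is a sketch under the assumption that $A$ is symmetric (as the notation $A > 0$ suggests), so that the transpose of the matrix in (3.3) is the sum of $P_n'X_{in}\psi'(\varepsilon_i)A^{-1}$; it also uses $P_n'K_n = 0$ from (4.4), $P_n'P_n = I_q$ from (3.2), and the fact that $U_n + \Theta_n$ is bounded in probability.

```latex
\begin{aligned}
P_n'\hat B_n
  &= P_n'\Bigl( K_n M_n + P_n\Theta_n
       + \sum_{i=1}^{n} X_{in}\,\psi'(\varepsilon_i)A^{-1} \Bigr) + o_p(1) \\
  &= \Theta_n + \sum_{i=1}^{n} P_n'X_{in}\,\psi'(\varepsilon_i)A^{-1} + o_p(1)
   = \Theta_n + U_n + o_p(1), \\
W_n
  &= \hat B_n' P_n P_n' \hat B_n
   = (U_n + \Theta_n)'(U_n + \Theta_n) + o_p(1).
\end{aligned}
```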


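Proof of Theorem 3.2. Under $H_n$, we have $B_n = K_nM_n + P_n\Theta_n$. By (4.14),

$$\|\tilde M_n - M_n\| = O_p(1). \tag{4.23}$$

By (4.10) and (4.23),

$$\sum_{i=1}^{n} \{\psi(Y_i - \tilde B_n'X_{in}) - \psi(\varepsilon_i)\} \otimes X_{in} + (A \otimes K_n)(\tilde M_n^* - M_n^*) - (A \otimes P_n)\Theta_n^* \to 0 \quad \text{in pr.}, \tag{4.24}$$

which implies that

$$\sum_{i=1}^{n} (\psi(Y_i - \tilde B_n'X_{in}) - \psi(\varepsilon_i)) \otimes (K_n'X_{in}) + (A \otimes I_{m-q})(\tilde M_n^* - M_n^*) \to 0 \quad \text{in pr.}, \tag{4.25}$$

and

$$\sum_{i=1}^{n} (\psi(Y_i - \tilde B_n'X_{in}) - \psi(\varepsilon_i)) \otimes (P_n'X_{in}) - (A \otimes I_q)\Theta_n^* \to 0 \quad \text{in pr.} \tag{4.26}$$

By (4.14) and (4.25),

$$\sum_{i=1}^{n} \psi(Y_i - \tilde B_n'X_{in}) \otimes (K_n'X_{in}) \to 0 \quad \text{in pr.}, \tag{4.27}$$

as $n \to \infty$. By (2.2), (2.6), (4.26) and (4.27), noting that $K_nK_n' + P_nP_n' = I_m$, we have

$$\begin{aligned}
R_n &= \Bigl( \sum_{i=1}^{n} \psi(Y_i - \tilde B_n'X_{in})X_{in}' \Bigr) \Bigl( \sum_{j=1}^{n} X_{jn}\psi'(Y_j - \tilde B_n'X_{jn}) \Bigr) \\
&= \Bigl( \sum_{i=1}^{n} \psi(Y_i - \tilde B_n'X_{in})X_{in}'K_n \Bigr) \Bigl( \sum_{j=1}^{n} K_n'X_{jn}\psi'(Y_j - \tilde B_n'X_{jn}) \Bigr) \\
&\quad + \Bigl( \sum_{i=1}^{n} \psi(Y_i - \tilde B_n'X_{in})X_{in}'P_n \Bigr) \Bigl( \sum_{j=1}^{n} P_n'X_{jn}\psi'(Y_j - \tilde B_n'X_{jn}) \Bigr) \\
&= \Bigl( \sum_{i=1}^{n} \psi(\varepsilon_i)X_{in}'P_n + A\Theta_n' \Bigr) \Bigl( \sum_{j=1}^{n} P_n'X_{jn}\psi'(\varepsilon_j) + \Theta_nA \Bigr) + o_p(1),
\end{aligned} \tag{4.28}$$

and Theorem 3.2 is proved in view of (3.4).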


5.  Estimation  of  the  nuisance  parameters 

In practical applications, we need to estimate the nuisance matrix parameters $\Gamma$
and $A$. A natural estimate of $\Gamma$ is

$$\hat\Gamma = n^{-1} \sum_{i=1}^{n} \psi(Y_i - \hat B'X_i)\psi'(Y_i - \hat B'X_i), \tag{5.1}$$

where $\hat B$ is an M-estimate of $B$ in the model (1.4). To estimate $A$, we take a $p \times p$
nonsingular matrix $Z$ consisting of $\zeta_1, \ldots, \zeta_p$ as its columns, take $h = h_n > 0$ such that

$$h_n/d_n \to \infty, \quad h_n \to 0 \quad \text{and} \quad \liminf_{n \to \infty} nh_n^2 > 0, \tag{5.2}$$

define

$$\eta_{ik} = \psi(Y_i - \hat B'X_i + h\zeta_k) - \psi(Y_i - \hat B'X_i - h\zeta_k), \quad i = 1, \ldots, n, \; k = 1, \ldots, p, \tag{5.3}$$

and use the $p \times p$ matrix

$$\hat A = (2nh)^{-1} \sum_{i=1}^{n} [\eta_{i1}, \ldots, \eta_{ip}]Z^{-1} \tag{5.4}$$

as an estimate of $A$. We have the following theorem:

Theorem 5.1. Assume that $(A_1)$-$(A_5)$ are satisfied in the model (1.4). Then

$$\hat\Gamma \to \Gamma \quad \text{in pr. as } n \to \infty. \tag{5.5}$$

Furthermore, if (5.2) also holds, then

$$\hat A \to A \quad \text{in pr. as } n \to \infty. \tag{5.6}$$

Note that Zhao and Chen (1990) gave a proof for the special case of $p = 1$.
However, the proof for the general case of $p$ is more complicated.
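The estimates (5.1) and (5.4) translate directly into code once the residuals of the M-fit are available. The sketch below is an illustration only: the function names, the row-wise calling convention for `psi`, the default choice of the identity for the direction matrix and the value of `h` are assumptions, and in practice `h` would have to be chosen in line with (5.2).

```python
import numpy as np

def gamma_hat(resid, psi):
    """Estimate (5.1): average of psi(r_i) psi(r_i)' over the residual rows r_i."""
    Psi = psi(resid)                                   # n x p
    return Psi.T @ Psi / resid.shape[0]

def a_hat(resid, psi, h, Z=None):
    """Estimate (5.4): symmetric differences of psi along the columns of Z."""
    n, p = resid.shape
    Z = np.eye(p) if Z is None else np.asarray(Z, dtype=float)
    cols = []
    for k in range(p):
        zeta = Z[:, k]
        eta = psi(resid + h * zeta) - psi(resid - h * zeta)   # rows are eta_ik'
        cols.append(eta.sum(axis=0))                            # sum over i
    return np.column_stack(cols) @ np.linalg.inv(Z) / (2.0 * n * h)

# Sanity check with the least squares choice psi(u) = u, for which A = I exactly.
rng = np.random.default_rng(6)
resid = rng.normal(size=(200, 2))
Z_dirs = np.array([[1.0, 1.0], [0.0, 1.0]])
A_ls = a_hat(resid, lambda r: r, h=0.1, Z=Z_dirs)
G_ls = gamma_hat(resid, lambda r: r)
```

For a non-smooth $\psi$ the quality of the estimate of $A$ depends on `h`: too small a value makes the difference quotient noisy, too large a value biases it.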

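Proof. Put $u = (u_1, \ldots, u_p)'$, $v = (v_1, \ldots, v_p)'$, $\psi(u) = (\psi_1(u), \ldots, \psi_p(u))'$. Write
$\Theta = \{\theta = (\theta_1, \ldots, \theta_p)' : \theta_1, \ldots, \theta_p = \pm 1\}$. At first we show that, if $|v_k| \le b/2$ for some
$b > 0$ and $k = 1, \ldots, p$, there exists a constant $c > 0$ such that

$$c \sum_{\theta \in \Theta} \theta'(\psi(u - b\theta) - \psi(u)) \le \psi_1(u + v) - \psi_1(u) \le c \sum_{\theta \in \Theta} \theta'(\psi(u + b\theta) - \psi(u)) \tag{5.7}$$

and similar inequalities hold for $\psi_k(u + v) - \psi_k(u)$, $k = 2, \ldots, p$.

Note that $\theta'(\psi(u + b\theta) - \psi(u)) \ge 0$ and $\theta'(\psi(u - b\theta) - \psi(u)) \le 0$ for any $\theta \in \Theta$.

In fact, by the cyclical monotonicity of $\psi$ (refer to Rockafellar, 1970, p. 238), for
any $\theta \in \Theta$ we have

$$v'\psi(u) + (b\theta - v)'\psi(u + v) - b\theta'\psi(u + b\theta) \le 0, \tag{5.8}$$

which implies that

$$(b\theta - v)'(\psi(u + v) - \psi(u)) \le b\theta'(\psi(u + b\theta) - \psi(u)) \tag{5.9}$$

and

$$(b\theta + v)'(\psi(u + v) - \psi(u)) \ge b\theta'(\psi(u - b\theta) - \psi(u)). \tag{5.10}$$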
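Inequalities of the type (5.9) and (5.10) hold for the gradient of any differentiable convex function, which makes them easy to probe numerically. The sketch below does this for the gradient of the log-sum-exp function; that particular convex function, the value of `b` and the random test points are assumptions chosen for the illustration.

```python
import numpy as np
from itertools import product

def psi(u):
    """Gradient of the convex function log(sum(exp(u))), i.e. the softmax of u."""
    e = np.exp(u - u.max())
    return e / e.sum()

rng = np.random.default_rng(4)
p, b = 3, 1.0
violations = 0
for _ in range(200):
    u = rng.normal(size=p)
    v = rng.uniform(-b / 2, b / 2, size=p)             # |v_k| <= b/2
    d = psi(u + v) - psi(u)
    for signs in product([-1.0, 1.0], repeat=p):       # all theta with entries +-1
        theta = np.array(signs)
        upper = b * theta @ (psi(u + b * theta) - psi(u))
        lower = b * theta @ (psi(u - b * theta) - psi(u))
        if (b * theta - v) @ d > upper + 1e-12:        # would contradict (5.9)
            violations += 1
        if (b * theta + v) @ d < lower - 1e-12:        # would contradict (5.10)
            violations += 1
```

No violation should be found, since both inequalities are consequences of cyclical monotonicity.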

For simplicity we write $\bar\theta = (\theta_1, \ldots, \theta_{p-1})'$, $\bar u = (u_1, \ldots, u_{p-1})'$, $\bar v = (v_1, \ldots, v_{p-1})'$, $\bar\psi = (\psi_1, \ldots, \psi_{p-1})'$,
and sometimes we write $\psi(\bar u', u_p)$ for $\psi(u)$. Taking $\theta_p = 1$ and $-1$ in (5.9) we get

$$(b\bar\theta' - \bar v')(\bar\psi(u + v) - \bar\psi(u)) + (b - v_p)(\psi_p(u + v) - \psi_p(u)) \le b(\bar\theta', 1)\{\psi(\bar u' + b\bar\theta', u_p + b) - \psi(u)\} \tag{5.11}$$

and

$$(b\bar\theta' - \bar v')(\bar\psi(u + v) - \bar\psi(u)) - (b + v_p)(\psi_p(u + v) - \psi_p(u)) \le b(\bar\theta', -1)\{\psi(\bar u' + b\bar\theta', u_p - b) - \psi(u)\}. \tag{5.12}$$

Multiplying both sides of (5.12) by $(b - v_p)/(b + v_p)$, and adding the inequality so
obtained to (5.11), we eliminate $\psi_p(u + v) - \psi_p(u)$ from (5.11) and (5.12), and get

$$\frac{2b}{b + v_p}(b\bar\theta' - \bar v')(\bar\psi(u + v) - \bar\psi(u)) \le b(\bar\theta', 1)\{\psi(\bar u' + b\bar\theta', u_p + b) - \psi(u)\} + b(\bar\theta', -1)\{\psi(\bar u' + b\bar\theta', u_p - b) - \psi(u)\}\frac{b - v_p}{b + v_p}. \tag{5.13}$$

Now it is not difficult to get the second inequality of (5.7) by using the elimination
method step by step. The first inequality of (5.7) could be obtained similarly from
(5.10).

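Without loss of generality, we assume that the true parameter matrix $B = 0$ in the
model (1.4). By (4.15) and $B_n = 0$, we have

$$P(\|\hat B_n\| \ge d_n^{-1/2}) \to 0 \quad \text{as } n \to \infty. \tag{5.14}$$

By $(A_4)$ and the strong law of large numbers,

$$\Gamma_n = n^{-1} \sum_{i=1}^{n} \psi(\varepsilon_i)\psi'(\varepsilon_i) \to \Gamma = (\gamma_{lm}) \quad \text{a.s.}, \tag{5.15}$$

as $n \to \infty$. Putting $\hat\Gamma = (\hat\gamma_{lm})$, $\Gamma_n = (\gamma_{lm}^{(n)})$, we have

$$\begin{aligned}
|\hat\gamma_{lm} - \gamma_{lm}^{(n)}|^2
&= \Bigl| n^{-1} \sum_{i=1}^{n} \{\psi_l(\varepsilon_i - \hat B_n'X_{in})\psi_m(\varepsilon_i - \hat B_n'X_{in}) - \psi_l(\varepsilon_i)\psi_m(\varepsilon_i)\} \Bigr|^2 \\
&\le 2n^{-1} \sum_{i=1}^{n} (\psi_l(\varepsilon_i - \hat B_n'X_{in}) - \psi_l(\varepsilon_i))^2 \cdot n^{-1} \sum_{i=1}^{n} \psi_m^2(\varepsilon_i - \hat B_n'X_{in}) \\
&\quad + 2n^{-1} \sum_{i=1}^{n} \psi_l^2(\varepsilon_i) \cdot n^{-1} \sum_{i=1}^{n} (\psi_m(\varepsilon_i - \hat B_n'X_{in}) - \psi_m(\varepsilon_i))^2.
\end{aligned} \tag{5.16}$$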

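On the event $(\|\hat B_n\| < d_n^{-1/2})$, $\|\hat B_n'X_{in}\| < d_n^{1/2}$ for each $i$. By (5.7), there exists a
positive constant $c$ such that

$$n^{-1} \sum_{i=1}^{n} (\psi_l(\varepsilon_i - \hat B_n'X_{in}) - \psi_l(\varepsilon_i))^2 I(\|\hat B_n\| < d_n^{-1/2}) \le c \max\Bigl\{ n^{-1} \sum_{i=1}^{n} \|\psi(\varepsilon_i + 2d_n^{1/2}\theta) - \psi(\varepsilon_i)\|^2 : \theta \in \Theta \Bigr\}. \tag{5.17}$$

By $(A_3)$ and $(A_5)$, for fixed $\theta$ we have

$$E\, n^{-1} \sum_{i=1}^{n} \|\psi(\varepsilon_i + 2d_n^{1/2}\theta) - \psi(\varepsilon_i)\|^2 = E\|\psi(\varepsilon_1 + 2d_n^{1/2}\theta) - \psi(\varepsilon_1)\|^2 \to 0 \tag{5.18}$$

as $n \to \infty$. By (5.14) and (5.16)-(5.18), we get

$$\lim_{n \to \infty} |\hat\gamma_{lm} - \gamma_{lm}^{(n)}| = 0 \quad \text{in pr., for } l, m = 1, \ldots, p, \tag{5.19}$$

which implies (5.5) in view of (5.15).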

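Now we proceed to prove (5.6). To this end, we prove that for any $c > 0$,

$$(nh)^{-1} \sum_{i=1}^{n} (\psi_l(\varepsilon_i - \hat B_n'X_{in} + h\zeta_k) - \psi_l(\varepsilon_i + h\zeta_k)) I(\|\hat B_n\| < c) \to 0 \quad \text{in pr.}, \; l = 1, \ldots, p. \tag{5.20}$$

By (5.7), in order to prove (5.20), it is enough to prove that for each fixed $\theta \in \Theta$,

$$T_n = (nh)^{-1} \sum_{i=1}^{n} \theta'(\psi(\varepsilon_i + 2cd_n\theta + h\zeta_k) - \psi(\varepsilon_i + h\zeta_k)) \to 0 \quad \text{in pr., as } n \to \infty. \tag{5.21}$$

By $(A_3)$, $(A_5)$ and (5.2), we have

$$\mathrm{Var}\, T_n \le (nh^2)^{-1} E[\theta'(\psi(\varepsilon_1 + 2cd_n\theta + h\zeta_k) - \psi(\varepsilon_1 + h\zeta_k))]^2 \le (nh^2)^{-1} \|\theta\|^2 E\|\psi(\varepsilon_1 + 2cd_n\theta + h\zeta_k) - \psi(\varepsilon_1 + h\zeta_k)\|^2 \to 0. \tag{5.22}$$

On the other hand, by $(A_2)$, $(A_5)$ and (5.2), we get

$$ET_n = h^{-1}\theta'E(\psi(\varepsilon_1 + 2cd_n\theta + h\zeta_k) - \psi(\varepsilon_1 + h\zeta_k)) \to 0. \tag{5.23}$$

By (5.22) and (5.23), we get (5.21) and (5.20). Noting that $\|\hat B_n\| = O_p(1)$, we have

$$(nh)^{-1} \sum_{i=1}^{n} (\psi(\varepsilon_i - \hat B_n'X_{in} + h\zeta_k) - \psi(\varepsilon_i + h\zeta_k)) \to 0 \quad \text{in pr.} \tag{5.24}$$

In the same way,

$$(nh)^{-1} \sum_{i=1}^{n} (\psi(\varepsilon_i - \hat B_n'X_{in} - h\zeta_k) - \psi(\varepsilon_i - h\zeta_k)) \to 0 \quad \text{in pr.} \tag{5.25}$$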

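By $(A_1)$ and (5.2), for $m = 1, \ldots, p$,

$$\mathrm{Var}\Bigl\{ (nh)^{-1} \sum_{i=1}^{n} (\psi_m(\varepsilon_i \pm h\zeta_k) - \psi_m(\varepsilon_i)) \Bigr\} \le (nh^2)^{-1} E(\psi_m(\varepsilon_1 \pm h\zeta_k) - \psi_m(\varepsilon_1))^2 \to 0 \quad \text{as } n \to \infty. \tag{5.26}$$

On the other hand, by $(A_2)$ and $h_n \to 0$, we have

$$(2nh)^{-1} \sum_{i=1}^{n} E(\psi(\varepsilon_i + h\zeta_k) - \psi(\varepsilon_i - h\zeta_k)) = (2h)^{-1} E(\psi(\varepsilon_1 + h\zeta_k) - \psi(\varepsilon_1 - h\zeta_k)) \to A\zeta_k, \quad k = 1, \ldots, p. \tag{5.27}$$

By (5.26) and (5.27), we have

$$(2nh)^{-1} \sum_{i=1}^{n} (\psi(\varepsilon_i + h\zeta_k) - \psi(\varepsilon_i - h\zeta_k)) \to A\zeta_k \quad \text{in pr.}, \tag{5.28}$$

for $k = 1, \ldots, p$. By (5.24), (5.25), (5.28) and (5.3), it follows that

$$\hat AZ = (2nh)^{-1} \sum_{i=1}^{n} [\eta_{i1}, \ldots, \eta_{ip}] \to AZ \quad \text{in pr.}, \tag{5.29}$$

and (5.6) is obtained. Now Theorem 5.1 is proved.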

Note 1. In estimating $A$ and proving the consistency of the estimate, we have not
made any additional assumptions on $\rho$. The only property used is its convexity. If,
however, $\rho$ is twice differentiable, other estimates are possible, as in the case of the
least distances estimate considered by Bai, Chen, Miao and Rao (1990).

Note 2. A referee remarks that Theorem 5.1 can be proved by applying the convexity
lemma in a recent paper by Pollard (1991). It is true, but the detailed proof given
by us using similar ideas will be of help in solving similar problems. Pollard's paper,
which contains results similar to the earlier papers by Bai, Rao and Yin (1990) and
Chen, Bai, Zhao and Wu (1990), was not available to us when our paper was submitted
for publication.

Note added in proof. This work can be extended to a more general case where
$\rho = \rho^{(1)} - \rho^{(2)}$ is a difference of two $p$-variate convex functions $\rho^{(1)}$ and $\rho^{(2)}$ with
$\psi(u) = \psi^{(1)}(u) - \psi^{(2)}(u)$ being the difference of their subgradients at $u$. Assume that
$(A_1)$-$(A_5)$ are satisfied with $\rho$ and $\psi$ in $(A_1)$-$(A_5)$ being replaced by $\rho^{(1)}, \psi^{(1)}$ and
$\rho^{(2)}, \psi^{(2)}$. We can construct the same test statistics with $\rho = \rho^{(1)} - \rho^{(2)}$ and
$\psi = \psi^{(1)} - \psi^{(2)}$ as before. It can be shown that, if the above conditions are met,
Theorems 3.1, 3.2 and 5.1 are still valid. In this context, the minimizer of the relevant
function could be taken to be one of its local minimizers having certain properties.
For the details, refer to Bai, Rao and Wu (1992).